The Battle for Data Engineer’s Favorite Programming Language Is Not Over Yet
Let's discuss the next contender for 2022

Picking the right programming language for the right job can be challenging as the technology moves fast, and so many frameworks are popping up.
Let's look back at the past few years to understand data engineers' current programming language ecosystem and identify the ideal candidate for 2022. Could Scala, Golang, or Rust be our next favorite? Let's find out.
The Early Days of Scripting and SQL 💾
Back when Python was, for most of us, still just a snake, Perl was widely used alongside other scripting languages.
Perl was originally developed for text processing, such as extracting the required information from a text file and converting that file into a different format. It was well suited to data work.
Perl was an excellent way to interface with SQL databases, which were the standard back then. Remember, this was before the cloud and before APIs dominated the way they do today.
It's important to note that what mainly powered Perl in data use cases was SQL. Perl was used to run SQL commands against databases.
Python’s Reign of Dominance 🐍
Today, a data engineer's job is not only interacting with SQL databases and running queries but also the following:
Managing infrastructure (through infrastructure as code with tools like Terraform/Pulumi)
Developing data pipelines
Developing microservices, APIs, and data frameworks
Interacting with cloud service SDKs
With the move to Big Data, we saw a lot of Java usage. It's still there, either behind the scenes or as the primary development API (👋 Trino, Flink, Akka, etc.).
Scala tried to rise (with Spark from Databricks). But even if it's more performant at scale and probably a better-suited language for data pipelines, it lacks significant adoption outside the Spark use case.
Databricks reported that most of their API calls are made through Python and SQL, forcing them to provide similar performance through the Python bindings. Another setback for Scala?
Python has massive adoption today, and here's why:
The learning curve for new programmers is gentle (notebooks help a lot).
The data science ecosystem: machine learning, visualization, deep learning.
Cloud adoption: all major cloud providers have a well-supported Python SDK.
Is there something that you can't do in Python?
The Brighter Future and Rust’s Potential ✨
SQL is here to stay for a while. Even with its limits, its low technical barrier to entry helps democratize data usage in general, and it's still the easiest way to interact with an SQL/analytical database.
Golang seemed to be a good competitor. Terraform and Kubernetes have massive adoption, and both are written in Golang. It's also designed and supported by a major cloud provider: Google.
That being said, there aren't that many data frameworks built around Golang. The learning curve is also a significant barrier for the average Python data user.
Who would be the next candidate then? Rust. Here are four non-exhaustive reasons:
1. General popularity
According to a Stack Overflow study, Rust has been the most loved programming language for four years in a row!
Google Trends also shows steady growth for Rust and a general fatigue with Python.
Another big piece of news is that Rust is set to become the second language of the official Linux kernel! That should bring it a lot of traction.
2. Performance and low-memory footprint
It's not big news that Python can be slow. Rust's performance is at another level because it's compiled directly into machine code. No virtual machine, no interpreter sitting between your code and the computer.
In our cloud computing era, the footprint of your program on your compute resources directly impacts your costs, as well as electricity usage and therefore the environment.
An interesting study covered by The New Stack compared how much electricity programming languages consume. Rust is near the top of that list, while Python… is near the bottom.
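To give a feel for the kind of code this section has in mind, here is a minimal sketch of a small aggregation in Rust. The `sum_amounts` function and the two-column input layout are made up for illustration, and the sketch relies only on the standard library. It walks a text buffer line by line and sums the second comma-separated field using iterator adapters, without building any intermediate collection.

```rust
/// Sums the second comma-separated field of every line.
/// Hypothetical helper for illustration: rows without a second field,
/// or whose second field is not a number (a header, for example), are skipped.
pub fn sum_amounts(csv: &str) -> f64 {
    csv.lines()
        // Take the second field of the line, if there is one.
        .filter_map(|line| line.split(',').nth(1))
        // Keep only the fields that parse as a floating-point number.
        .filter_map(|field| field.trim().parse::<f64>().ok())
        // Accumulate directly; no intermediate Vec is allocated.
        .sum()
}
```

Calling `sum_amounts("id,amount\na,1.5\nb,2.5")` skips the header row, since `amount` does not parse as a number, and adds up the two remaining rows. This is not a benchmark, only an illustration of the style: the whole pipeline borrows from the input string and is compiled ahead of time.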
3. Interoperability with Python
What if you could rewrite some part of your existing Python code base and still use it through your main Python program? That's combining the best of the two worlds.
A concrete use case would be performing specific operations on S3 files, which can be pretty slow in Python. With AWS recently announcing the developer preview of its AWS SDK for Rust, this is something you could do in Rust. Using PyO3, a Rust library that provides bindings for Python, you can quickly build a simple interface to call your Rust code from Python!
Even Microsoft published a windows crate that lets you access Win32 APIs from Rust!
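As a rough sketch of what the Rust side of such an interface could look like, the example below exports a function over the plain C ABI. Note that this is a swapped-in technique: it is not PyO3, which is more ergonomic but requires an external crate, whereas this version needs only the standard library. The function name `count_newlines` and its task are hypothetical, chosen purely for illustration.

```rust
use std::slice;

/// Counts newline bytes in a buffer handed over by the caller.
/// Hypothetical example function, exported with an unmangled name so that
/// a foreign caller can look it up in the compiled shared library.
/// (On the 2024 edition the attribute is spelled `#[unsafe(no_mangle)]`.)
///
/// # Safety
/// `ptr` must point to `len` readable bytes. A null `ptr` returns 0.
#[no_mangle]
pub unsafe extern "C" fn count_newlines(ptr: *const u8, len: usize) -> usize {
    if ptr.is_null() {
        return 0;
    }
    // Rebuild a borrowed byte slice from the raw pointer and length.
    let bytes = unsafe { slice::from_raw_parts(ptr, len) };
    // Count the bytes equal to '\n'.
    bytes.iter().filter(|&&b| b == b'\n').count()
}
```

Assuming the crate is built as a `cdylib`, a Python program could load the resulting shared library with `ctypes.CDLL`, declare the function's `argtypes` and `restype`, and pass it a `bytes` object together with its length. PyO3 removes most of that manual plumbing, which is why it is usually the more comfortable route for real projects.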
4. A lot of data projects are being rebuilt in Rust
Apache Arrow is a key common interface for building data processing frameworks. It has a great Rust implementation, and it's driving the rise of other data projects:
Spark's Rust equivalent, called DataFusion
Delta Lake has a native Rust interface with bindings for Python and Ruby.
Other big players like Confluent Kafka now offer a Rust binding.
There are many new projects to handle data. It's still early days, but since adoption is growing, we could even see Java stop being the default choice.
Is It Worth It, Though? 🤔
Rust and Python were originally built with different goals. The learning curve is steeper for Rust, and it will be difficult for some data citizens (data scientists, data analysts) to get on board. You are making a trade-off between performance and simplicity.
The data engineer role is evolving toward that of a DevOps/backend engineer rather than just the “SQL person.” In that context, it makes sense to try out Rust for some use cases. Rust's mindset is also valuable for any programming language you learn next. If you want to get your hands dirty, one of my favorite resources for Rust is the YouTube channel Let's Get Rusty.
In the end, programming languages are just part of your toolbelt, and it doesn't hurt to have more than one, especially when you see how quickly the data engineer's scope has been expanding lately! 🚀