Job Description
Key responsibilities:
Design, build, and maintain scalable data pipelines on Databricks (ETL/ELT workflows)
Develop high-quality, efficient code using Python (PySpark) and SQL for data transformation and processing
Optimize data pipelines and Spark workloads for performance, scalability, and reliability
Work with large-scale structured and unstructured datasets using Apache Spark and Delta Lake
Collaborate with analysts and stakeholders to deliver data-driven solutions
Required Skills & Experience:
Strong hands-on experience with Databricks and Apache Spark
Advanced coding skills in Python (PySpark) and SQL (mandatory)
Experience building data pipelines, ETL/ELT processes, and data models
Good understanding of data warehousing and lakehouse architectures
Good understanding of Power BI or other visualisation tools
Familiarity with cloud platforms (Azure / AWS ...